ABSTRACT

A common problem in educational research is measuring
the degree of relationship or association between two variables. Many
investigators habitually use Pearson's product-moment correlation
coefficient or a transformation of $\chi^2$. In the past two decades,
however, a variety of association measures have been introduced in
the statistics literature. This report contains a review of available
association measures, supplemented by discussion of the several
factors involved in selecting a measure of association, such as the
types of variables (continuous, ranked, ordered) and the type of
association expected (linear, monotone, general). Examples illustrate
the necessary calculations and provide comparisons among the
measures. (Author)
Introductory Statement



The Center's mission is to improve teaching in American schools. 

Too many teachers still employ a didactic style aimed at filling passive 
students with facts. The teacher's environment often prevents him from 
changing his style, and may indeed drive him out of the profession. 

And the children of the poor typically suffer from the worst teaching. 

The Center uses the resources of the behavioral sciences in pursuing
its objectives. Drawing primarily upon psychology and sociology,
but also upon other behavioral science disciplines, the Center has
formulated programs of research, development, demonstration, and
dissemination in three areas. Program 1, Teaching Effectiveness, is now
developing a Model Teacher Training System that can be used to train both
beginning and experienced teachers in effective teaching skills. Program 2,
The Environment for Teaching, is developing models of school organization
and ways of evaluating teachers that will encourage teachers to become
more professional and more committed. Program 3, Teaching Students from
Low-Income Areas, is developing materials and procedures for motivating
both students and teachers in low-income schools.

Many research studies result in the computation of measures of
association between pairs of variables. This paper provides a review of the
variety of measures available and explains the circumstances under which
each is appropriate. This paper, then, should help educational researchers
make use of appropriate statistical methodology in studying relationships
between variables.
MEASURES OF ASSOCIATION
Janet Dixon Elashoff and Charles R. Dunbar

Introduction 

A common problem in educational research is that of measuring the
degree of relationship or association between two variables. For example,
an investigator might wish to estimate the degree to which achievement
scores can be predicted from IQ scores or the degree of agreement
between two raters using the same five-point scale. A wide variety of
association measures is available: Pearson's r, Kendall's $\tau$, Goodman
and Kruskal's $\gamma$, and others. This study reviews some association
measures discussed in the statistical literature and offers guidelines for
making an appropriate choice of measure for a particular problem.

Each measure of association was developed to be applied in a par- 
ticular class of problems. Thus, in our review we pay special attention 
to the several major factors determining the type of problem for which 
a measure was designed: 

1. What type of measurement scale do the two variables represent? 

2. Does one variable logically precede the other; that is, will 
one variable be used to predict the other and should the 
measure of association reflect this? 

3. What type of relationship between the two variables is the
measure sensitive to?

4. What sampling conditions and assumptions about the joint
distribution of the two variables are necessary for standard tests
of significance to be valid?

For this study, association measures have been grouped according to
the type of measurement scale for which they are designed. Variables
such as age or height are designated as continuous variables even though
they are usually rounded off to the nearest year or inch. Discrete
variables are classified into four basic types: (a) rank-ordered values,
observations ranked from 1 to n; (b) ordered multicategorical values,
observations assigned scores such as 1, 2, 3, 4, 5; (c) unordered
nominal values; (d) dichotomous values. Although dichotomous variables can
logically be included in type (b) or (c), some measures of association
have been developed specifically for them.

Naturally, a variable that is intrinsically continuous could be
turned into a ranked or ordered categorical variable, or for some
purposes a variable of type (a), (b), or (d) could be treated as continuous.
Therefore, classification of variables by this scheme may be somewhat
arbitrary and should serve mainly as a preliminary guide to choosing a
measure of association. The final choice should rest most heavily on
consideration of the type of relationship between variables that is of
interest.

Measures of association are intended to describe the degree of
relationship between two variables and are usually defined to be +1.0 (or
-1.0) for a perfect predictive relationship and 0.0 for no relationship.
Each measure of association is designed for a different type of
relationship. For example, since Pearson's r is designed to measure the degree
of linear relationship, |r| = 1 only for perfect linear relationships,
and r = 0 cannot be used to infer the "independence" of the two
variables in the population; it merely indicates no linear relationship in
the sample. Many different kinds of relationships are possible between
two variables: (a) a linear relationship, e.g., the relationship between a
pretest and a posttest IQ score might be expected to be linear; (b) a
monotone relationship, e.g., average weight increases with height, but
the average difference in weight for two inches' difference in
height may be different at heights near 30 inches than at heights near
72 inches; (c) general association, e.g., small-group discussions occur
much more frequently in connection with social studies lessons, whereas
individual work is more often associated with math lessons.

In selecting a measure of association it is most important to have 
in mind the type of relationship that might be expected to occur or that 
would be of most interest. A first step in this selection procedure is 
to arrange the data in a scatterplot or two-way table so that a visual 
assessment can be made of the relationship between the two variables 
and of any peculiarities in the data that might invalidate the choice 
of a particular measure of association. 

Some measures of association, such as Pearson's r, are said to be
symmetric in the two variables. That is, regardless of whether the
prediction is x from y or y from x, the measure of association, r, is the
same. Other measures, like the correlation ratio $\eta$ or the measure $\lambda$,
are said to be asymmetric; that is, under these models, prediction of
y from x might be more accurate than prediction of x from y, and thus
$\eta_{y \cdot x} \neq \eta_{x \cdot y}$ in general. Unless otherwise noted, measures of
association are usually symmetric in the two variables.

Measures of association can be defined for a population or a sample.
For brevity we give only the formula for the sample measure; the reader
interested in the corresponding population measures should refer to the
literature cited for each measure. If a sample has been selected at
random from some population, then the sample measure may be used to make
inferences about the value of the population measure. This assumption
of random sampling is implicit in all tests of the statistical
significance of a measure of association. Other assumptions necessary for
testing the statistical significance of an observed degree of
association will be discussed in conjunction with each particular measure.

Continuous or Rank-Ordered Variables 

If both variables are continuous, the standard measure of
association is Pearson's r. Two nonparametric measures, Spearman's $r_s$ and
Kendall's tau, are also applicable in this case. Two hypothetical
examples will be used to illustrate these measures.

In a research study, several teachers were observed in their
classrooms over a six-week period. Among other things, the observers coded
negative feedback (e.g., criticism, rebuke) given by these teachers to
18 randomly chosen students, and, at the end of the time period, a
self-esteem assessment scale was administered to the students. The data
are shown in Table 1 and Figure 1.

For the second example, shown in Table 2 and Figure 2, an instructor
in an educational psychology course wished to use the class results on
the subtests of the Scholastic Aptitude Test to demonstrate that the two
test sections usually have a correlation near .64.

TABLE 1

Student Self-Esteem and Rate of Negative Feedback from the Teacher
(Average Negative Responses per Pupil Classroom Hour)

                         Negative-                            Negative-
Student   Self-esteem    feedback rate   Student   Self-esteem   feedback rate

   1           44           4.13           10           64          2.75
   2           70           3.38           11          100          1.25
   3          110            .38           12           34          5.88
   4           42           5.63           13           74          2.50
   5           68           1.88           14           32          7.00
   6           88           2.36           15          120           .64
   7           72           2.12           16           62          3.82
   8           90           1.00           17           56          3.00
   9          102            .85           18           86          1.50

Fig. 1. Scatterplot of self-esteem vs. negative-feedback rate.
(Vertical axis: self-esteem scale, 20 to 120; horizontal axis: average
negative responses per pupil classroom hour, 1 to 7.)

TABLE 2

Scholastic Aptitude Subtest Results for Twenty-Four Students
(Possible Score of 800)

Student   Verbal   Mathematical   Student   Verbal   Mathematical

   1       770         690           13       800         770
   2       540         480           14       740         790
   3       610         540           15       570         660
   4       630         510           16       680         720
   5       590         640           17       580         610
   6       700         540           18       660         490
   7       650         530           19       450         400
   8       510         380           20       610         460
   9       610         680           21       560         550
  10       520         420           22       620         680
  11       690         590           23       610         680
  12       670         760           24       660         650

Fig. 2. Scatterplot of verbal vs. mathematical SAT scores.
(Vertical axis: verbal, 300 to 800; horizontal axis: mathematical.)

The observations on the two variables will be denoted $x_i$ and $y_i$ for
individuals i = 1, ..., n.

Pearson Product-Moment Correlation 

The most commonly used measure of association for two continuous
variables is the Pearson product-moment correlation. The population
measure is usually denoted by $\rho$ and the sample measure by r. Pearson's r
is designed for situations in which the relationship between variables
x and y is linear. The measures $\rho$ and r will be +1.0 for a perfect
linear relationship with positive slope (see Figure 3a) and -1.0 for a
perfect linear relationship with negative slope (Figure 3b). They will be
zero if there is no linear relationship (see Figure 3c and d).

The population measure $\rho$ is defined as

$$\rho = \frac{\mathrm{Covariance}(x, y)}{\sigma_x \sigma_y} = \frac{E(X - \mu_x)(Y - \mu_y)}{\left[ E(X - \mu_x)^2 \, E(Y - \mu_y)^2 \right]^{1/2}} \tag{1}$$

The sample statistic r is

$$r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\left[ \sum_i (x_i - \bar{x})^2 \cdot \sum_i (y_i - \bar{y})^2 \right]^{1/2}} \tag{2}$$



Fig. 3. Scatterplots showing different values of r: (a) r = +1.0;
(b) r = -1.0; (c) r = 0.0; (d) r = 0.0; (e) r = .75.

For the self-esteem example from Table 1, r = -.924. This means
there is a close linear relationship between negative-feedback rate and
self-esteem, but in a negative direction. As negative responses increase,
self-esteem decreases. For the data in Table 2, r = .692, representing
a fairly high, positive degree of linear relationship. It is also well
within the instructor's anticipated limits.
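
The sample formula (2) can be checked numerically by working directly from deviations about the means. The following is a minimal Python sketch; the function name `pearson_r` and the three small data sets are illustrative assumptions and are not taken from Table 1 or Table 2.

```python
import math

def pearson_r(x, y):
    """Sample product-moment correlation, following equation (2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of cross-products of deviations from the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Denominator: square root of the product of the two sums of squares.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustrative data (not from the tables in this report).
x = [-2, -1, 0, 1, 2]
print(pearson_r(x, [2 * v + 1 for v in x]))   # perfect line, positive slope
print(pearson_r(x, [-3 * v for v in x]))      # perfect line, negative slope
print(pearson_r(x, [v ** 2 for v in x]))      # exact curve, no linear trend
```

In the third case y is completely determined by x, yet r is zero, which illustrates why r = 0 indicates only the absence of a linear relationship.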

To make inferences about the population measure $\rho$, using r, it must
be assumed that the sample observations were randomly selected from the
population and that x and y have a bivariate normal distribution.
Bivariate normality implies that both x and y have normal distributions
and that the relationship between x and y is linear. For tests and
confidence intervals about the value of $\rho$, see, e.g., Dixon and Massey (1969)
or Hays (1963). For discussions of the many factors affecting the size
of r, see Carroll (1961) or Walker and Lev (1969).

In cases where the two variables can be assumed to have a bivariate
normal distribution and for some reason one or both variables have been
dichotomized but the investigator still wishes to make inferences about
the population value of $\rho$, the measures tetrachoric r, point biserial
r, $\phi$, and biserial r have been developed. These are discussed later
in this study. When the variables have been categorized into more than
two categories, estimation methods for $\rho$ using the polychoric series
method have been developed by Lancaster and Hamdan (1964).

Pearson's r is designed for situations in which the relationship
between variables is linear, and for inferences about $\rho$ to be valid the
joint distribution of x and y must be bivariate normal. If a monotone,
but not necessarily linear, relationship is of interest, or bivariate
normality is unlikely, the nonparametric measures of association,
Spearman's rank correlation coefficient or Kendall's tau, should be considered.
Spearman's Rank-Correlation Coefficient

Spearman's $r_s$ is designed to measure the degree of monotone
relationship between two variables x and y. Instead of using the exact score
values, the observations are ranked from lowest to highest on each
variable separately and then Pearson's r is calculated on the ranks. When
two variables have a monotone relationship, their ranks will have a
linear relationship.

For convenience, the mathematically equivalent computing formula

$$r_s = 1 - \frac{6 \sum_i D_i^2}{n(n^2 - 1)} \tag{3}$$

may be used. The number $D_i$ is the difference between the x and y ranks
for the i-th individual.

Spearman's $r_s$ will be +1.0 for perfect positive monotone
relationships such as those shown in Figure 3a and e, -1.0 for perfect negative
monotone relationships, and 0.0 if there is no relationship or the
relationship is not monotone (Figure 3c or d).

For the self-esteem data shown in Table 1, the rank scores are
shown in Table 3 and the rank scatterplot in Figure 4. The obtained
$r_s$ = -.948 is very close to the Pearson's r = -.924.
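
The computing formula (3) can be applied to the Table 1 scores as a check on the reported value. This is a minimal sketch under the assumption that no scores are tied (which holds for Table 1); the helper names `ranks` and `spearman_rs` are illustrative.

```python
def ranks(values):
    """Rank from lowest (1) to highest (n); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def spearman_rs(x, y):
    """Spearman's rank correlation by computing formula (3)."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Table 1 data, students 1 through 18.
self_esteem = [44, 70, 110, 42, 68, 88, 72, 90, 102,
               64, 100, 34, 74, 32, 120, 62, 56, 86]
feedback = [4.13, 3.38, 0.38, 5.63, 1.88, 2.36, 2.12, 1.00, 0.85,
            2.75, 1.25, 5.88, 2.50, 7.00, 0.64, 3.82, 3.00, 1.50]

print(round(spearman_rs(feedback, self_esteem), 3))  # -0.948
```

Ranking both variables from highest to lowest instead would only change the sign of every difference, so the sum of squared differences and the result are unchanged.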

Sometimes observations will be tied as in the mathematical and ver- 
bal scores from Table 2. If the number of ties is small, the midranks 
of the tied observations can be used (see below) and formula (3) can 
still be applied. If the number of tied observations is large, then the 
reader should refer to the formula given in Kendall (1970). 

TABLE 3

Ranked Scores on x (Negative Feedback) and y (Self-Esteem) for
Eighteen Students Arranged in Order by Rank on x

Student   x ranks   y ranks    D_i

  14         1        18       -17
  12         2        17       -15
   4         3        16       -13
   1         4        15       -11
  17         5        12        -7
  16         6        14        -8
  10         7        11        -4
   5         8         7         1
   2         9        13        -4
   7        10         8         2
  13        11        10         1
  18        12         6         6
   6        13         9         4
   8        14         4        10
  11        15         5        10
   9        16         3        13
   3        17         1        16
  15        18         2        16

$$r_s = 1 - \frac{6 \sum_i D_i^2}{n(n^2 - 1)} = 1 - \frac{11328}{5814} = 1 - 1.948 = -.948$$

Fig. 4. Rank scatterplot for self-esteem data. (Horizontal axis:
negative-feedback rate.)

The rankings for the data of Table 2 are found in Table 4 and
Figure 5. Midranks are assigned by averaging the rank positions which tied
observations would have if they were distinguishable. In Table 4,
students 9, 22, and 23 are tied on mathematics (x) scores. If their scores
were distinguishable, these three would hold ranks 6, 7, and 8. Since
they are tied, they are all assigned the midrank (6 + 7 + 8)/3 = 7.0.
Similarly, students 3 and 6 would hold ranks 15 and 16, but because they
are tied, both are given rank (15 + 16)/2 = 15.5. For the ranks of
Table 4, $r_s$ = .638 as compared to Pearson's r = .692.

TABLE 4

Twenty-four Students Ordered by Rank on the
Mathematics Subtest x

             x ranks
          (mathematical   y ranks
Student     subtest)      (verbal)

  14           1             3
  13           2             1
  12           3             7
  16           4             6
   1           5             2
   9           7.0          14.5
  22           7.0          12
  23           7.0          14.5
  15           9            19
  24          10             8.5
   5          11            17
  17          12            18
  11          13             5
  21          14            20
   3          15.5          14.5
   6          15.5           4
   7          17            10
   4          18            11
  18          19             8.5
   2          20            21
  20          21            14.5
  10          22            22
  19          23            24
   8          24            23

$$r_s = 1 - \frac{6(832.5)}{24(24^2 - 1)} = 1 - \frac{4995}{13800} = .638$$

Fig. 5. Rank scatterplot of SAT data. (Vertical axis: verbal.)
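
The midrank rule described above can be sketched in a few lines of Python. The function name `midranks` is an illustrative assumption; ranks run from the highest score (rank 1) downward, as in Table 4, and student 22's mathematical score is taken as 680, consistent with the three-way tie among students 9, 22, and 23.

```python
def midranks(scores):
    """Rank from highest (1) to lowest (n), giving tied scores their midrank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0.0] * len(scores)
    start = 0
    while start < len(order):
        # Find the end of the run of observations tied with order[start].
        end = start
        while (end + 1 < len(order)
               and scores[order[end + 1]] == scores[order[start]]):
            end += 1
        # 0-based positions start..end correspond to ranks start+1..end+1.
        midrank = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            result[order[k]] = midrank
        start = end + 1
    return result

# Table 2 data, students 1 through 24.
mathematical = [690, 480, 540, 510, 640, 540, 530, 380, 680, 420, 590, 760,
                770, 790, 660, 720, 610, 490, 400, 460, 550, 680, 680, 650]
verbal = [770, 540, 610, 630, 590, 700, 650, 510, 610, 520, 690, 670,
          800, 740, 570, 680, 580, 660, 450, 610, 560, 620, 610, 660]

x_ranks = midranks(mathematical)
y_ranks = midranks(verbal)
print(x_ranks[8], x_ranks[21], x_ranks[22])   # students 9, 22, 23
print(x_ranks[2], x_ranks[5])                 # students 3 and 6

# Sum of squared rank differences and formula (3) applied to the midranks.
sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(x_ranks, y_ranks))
n = len(x_ranks)
print(sum_d2, round(1 - 6 * sum_d2 / (n * (n ** 2 - 1)), 3))
```

Averaging the first and last rank of a tied run gives the same value as averaging all the ranks in the run, since the ranks are consecutive.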

If the observations represent a random sample from some population,
the null hypothesis that x and y do not have a monotone relationship
(that is, that their ranks do not have a linear relationship) can be
tested. Use of $r_s$ to make inferences about association in the population
requires that observations can be ranked without ties, so x and y must be
continuous variables. For small n ($n \le 30$), tables of the critical values
of $r_s$ may be found in Siegel (1956). For large n, tables and tests for
Pearson's r provide approximate tests of the significance of $r_s$.

For measures of association designed for continuous variables, the
occurrence of ties can affect the validity of the significance tests. If
the proportion of ties is small, the method outlined above should be
reasonable. Other approaches to the handling of ties are possible; for
example, see the section on "ambiguous data" in Bradley (1963). If the
proportion of ties is not small, one of the association measures designed
for categorical variables should be considered instead.

Kendall's Tau 

Like Spearman's $r_s$, Kendall's tau, $\tau$, is a nonparametric rank
measure of association which measures the degree of monotone relationship
between two variables; $\tau$, however, is derived from different principles.
Kendall's $\tau$ is +1.0 for a perfect positive monotone relationship between
x and y, -1.0 for a perfect negative relationship, and 0.0 when no monotone
relationship exists. The measure $\tau$ is based on the idea of examining all
possible pairs of observations and recording for each whether the
relative ranks assigned to x agree with those assigned to y. For example,
for students numbered 14 and 12 from Table 3, the x ranks are 1 and 2
and the y ranks are 18 and 17 respectively. Thus, these two individuals
are ranked in the opposite order by variables x and y and provide an
instance of disagreement between the two rankings. Students numbered
17 and 16 provide an instance of agreement between the two rankings.
After all pairs of observations are examined, the difference between the
total number of agreements (P) and the total number of disagreements (Q)
is compared to the maximum possible number of agreements, n(n-1)/2.
Thus,

$$\tau = \frac{P - Q}{n(n-1)/2} \tag{4}$$



Consider the data from Table 3. Since there are 18 cases, the total
number of pairs is $\binom{n}{2} = \frac{n!}{(n-2)! \, 2!} = \frac{n(n-1)}{2} = \frac{18 \cdot 17}{2} = 153$. When there
are no ties it is only necessary to count P or Q, but not both, because
$P + Q = \binom{n}{2}$. That is, the number of pairs whose ranks disagree and
the number of pairs whose ranks agree must add up to the total number of
pairs. In the example, the student pairs numbered (17,16), (17,2),
(10,2), (5,2), (5,13), (5,6), (7,6), (18,6), (8,11), (3,15) agree in
x and y rank order. Thus P = 11 and it follows that Q = 153 - 11 = 142.
The result is $\tau = \frac{11 - 142}{153} = -.856$. This indicates a high degree
of negative monotonicity; note that $\tau$ is somewhat smaller in magnitude than
Spearman's rank correlation or the Pearson product-moment correlation.
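
The pair-counting idea behind equation (4) can be sketched as follows. The function name `kendall_tau` and the five-case rankings are hypothetical and chosen only to keep the count easy to follow by hand; the sketch assumes no ties.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Return (P, Q, tau) by equation (4); assumes no ties in x or in y."""
    n = len(x)
    p = q = 0
    for i, j in combinations(range(n), 2):
        # A pair agrees when both variables order the two cases the same way.
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            p += 1
        elif product < 0:
            q += 1
    return p, q, (p - q) / (n * (n - 1) / 2)

# Hypothetical rankings of five cases (not from the tables above).
x_ranks = [1, 2, 3, 4, 5]
y_ranks = [2, 1, 4, 3, 5]
print(kendall_tau(x_ranks, y_ranks))   # (8, 2, 0.6)
```

Here the only disagreeing pairs are the first two cases and the third and fourth cases, so P + Q equals the total of 10 pairs, as it must when there are no ties.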

Assuming that the individuals represent a random sample from some
population in which x and y are continuous variables, $\tau$ may be used to
test the null hypothesis of no monotone relationship between x and y.
The test is based on the asymptotic normality of S = P - Q. For large n,
the variance of S is

$$\sigma_S^2 = \frac{1}{18} n(n-1)(2n+5) \tag{5}$$

Thus if x and y have no monotone relationship,

$$z = \frac{P - Q}{\sigma_S}$$

has approximately a standard normal distribution, and the null hypothesis
would be rejected if $P - Q > \sigma_S z_{1-\alpha/2}$ or $P - Q < \sigma_S z_{\alpha/2}$. See Kendall
(1970) for additional details and an explanation of the continuity
correction that should be used when n is small.
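
The large-sample test can be illustrated with the counts reported above for the self-esteem example. This is a rough sketch only: the function name `kendall_z` is an assumption, and no continuity correction is applied even though n = 18 is not large.

```python
import math

def kendall_z(p, q, n):
    """Normal approximation for S = P - Q using the variance in equation (5)."""
    variance_s = n * (n - 1) * (2 * n + 5) / 18
    return (p - q) / math.sqrt(variance_s)

# P and Q as reported above for the self-esteem example (n = 18).
z = kendall_z(11, 142, 18)
print(round(z, 2))
```

For n = 18 the variance of S is 697, so a difference of P - Q = -131 lies several standard deviations below zero, far beyond the two-sided .05 critical value of 1.96.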

Kendall's tau requires that the variables be continuous so that no
ties in the ranks can occur. Because of measurement problems, however,
ties in ranking often do occur, and a special variation of tau can be
used. It is

$$\tau_b = \frac{P - Q}{\left[ \left( \frac{n(n-1)}{2} - T_x \right) \left( \frac{n(n-1)}{2} - T_y \right) \right]^{1/2}} \tag{6}$$

where $T_x$ is the number of pairs in which x scores are tied, $T_y$ is the
number of pairs in which y scores are tied, and $T_{xy}$ is the number of
pairs in which both the x and y scores are tied. Note that

$$\binom{n}{2} = P + Q + T_x + T_y - T_{xy} \tag{7}$$

For computation of $\tau_b$, consider the SAT scores for the 24 students
shown in Table 4. Note that there are 24·23/2 = 276 different pairs of
students, and there are four tied pairs in the x ranks, three from the
7.0's and one from the 15.5's. In general, if there are m cases with
the same rank, these cases generate m(m-1)/2 tied pairs. Similarly,
there are seven tied pairs in the y ranks, six from the 14.5's and one
from the 8.5's. Ties are not
counted as correctly or incorrectly classified and hence are ignored
when calculating P and Q. That is, when comparing students 13 and 12,
the x and y rankings agree, but when comparing students 24 and 18, the
order is neither correct nor incorrect. In this way, Q = 67, P = 199,
and $T_{xy}$ = 1.

$$\tau_b = \frac{199 - 67}{[(276 - 4)(276 - 7)]^{1/2}} = \frac{132}{(272 \cdot 269)^{1/2}} = \frac{132}{270.5} = .488$$

See Kendall (1970) and Goodman and Kruskal (1972) for significance tests
of $\tau_b$.
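
The tie counts and equation (6) can be checked against the Table 4 ranks. In this sketch the helper names `tied_pairs` and `tau_b` are illustrative, and P = 199 and Q = 67 are taken as given from the text rather than recounted.

```python
import math
from collections import Counter

def tied_pairs(ranks):
    """Number of tied pairs: each group of m equal ranks gives m(m-1)/2."""
    return sum(m * (m - 1) // 2 for m in Counter(ranks).values())

def tau_b(p, q, n, t_x, t_y):
    """Kendall's tau-b by equation (6)."""
    total_pairs = n * (n - 1) / 2
    return (p - q) / math.sqrt((total_pairs - t_x) * (total_pairs - t_y))

# x and y ranks from Table 4, in the order listed there.
x_ranks = [1, 2, 3, 4, 5, 7.0, 7.0, 7.0, 9, 10, 11, 12, 13, 14,
           15.5, 15.5, 17, 18, 19, 20, 21, 22, 23, 24]
y_ranks = [3, 1, 7, 6, 2, 14.5, 12, 14.5, 19, 8.5, 17, 18, 5, 20,
           14.5, 4, 10, 11, 8.5, 21, 14.5, 22, 24, 23]

t_x, t_y = tied_pairs(x_ranks), tied_pairs(y_ranks)
print(t_x, t_y)                                  # 4 7
print(round(tau_b(199, 67, 24, t_x, t_y), 3))    # 0.488
```

As a consistency check on relation (7), 199 + 67 + 4 + 7 - 1 = 276, the total number of pairs for n = 24.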

For data with tied observations, Kendall also introduces $\tau_a$, the
average of the values of $\tau$ obtained by all possible different rankings
of tied observations. A variation of $\tau$ called $\tau_c$ can be used for
contingency tables where the number of rows is large relative to the number
of columns or vice versa.

Summary

Three measures of association for use when both variables are
continuous have been discussed: Pearson's r, Spearman's rank correlation
coefficient, and Kendall's tau. Pearson's r is a measure of linear
association and requires the assumption of bivariate normality for valid
tests of significance. Spearman's $r_s$ and Kendall's $\tau$ measure the degree
of monotone association (or linear association of ranks). For the data
in Table 1, r = -.92, $r_s$ = -.95, $\tau$ = -.86. For the data in Table 2,
r = .69, $r_s$ = .64, $\tau_b$ = .49.

The measures $r_s$ and $\tau$ may be used when the original scores have
resulted simply from two different rank orderings; measurements on an
interval or ratio scale are unnecessary.

Categorical Variables 

Association measures designed for the case where both variables are
categorical include Goodman and Kruskal's $\gamma$, Somers' d, lambda ($\lambda$), the
uncertainty coefficient u, and a variety of measures based on $\chi^2$ (chi
square).

When both variables are categorical, the data are usually displayed
in a classification table, such as Table 5. In this hypothetical
example, two teachers were asked to estimate the achievement potential for
78 tenth-grade students who had signed up for a French language course.
Their ratings were confined to the three categories of below average,
average, and above average.

TABLE 5

Agreement on French Achievement Potential by Two Classroom Teachers

                                 Teacher 1 = y
                       Below                  Above
Teacher 2 = x          average    Average    average    Totals

Below average            12          5          2         19
Average                   3         34          6         43
Above average             1          7          8         16

Totals                   16         46         16         78

The values of the x variable, ratings by Teacher 2, are used to
define the rows of the table, and the values of the y variable to define
the columns. The number of individuals receiving score number i on the
x variable and score number j on the y variable is entered in the ij-th
"cell" of the table and denoted by $f_{ij}$. Thus, the frequency of
individuals rated as having average potential by Teacher 2 and as having
above-average potential by Teacher 1 is $f_{23}$ = 6. Row totals are denoted
by $f_{i \cdot}$ and column totals by $f_{\cdot j}$; the grand total is $f_{\cdot \cdot}$ = n. For the
data of Table 5, $f_{\cdot 2}$ = 46 and n = 78.
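
The cell and marginal notation can be made concrete with the Table 5 frequencies. This is a small illustrative sketch; the variable names are assumptions, and Python's 0-based indexing means that the cell in row 2, column 3 of the table is `f[1][2]`.

```python
# Table 5: rows are Teacher 2 (x), columns are Teacher 1 (y).
f = [
    [12, 5, 2],   # below average
    [3, 34, 6],   # average
    [1, 7, 8],    # above average
]

row_totals = [sum(row) for row in f]            # row totals, one per x category
column_totals = [sum(col) for col in zip(*f)]   # column totals, one per y category
n = sum(row_totals)                             # grand total

print(row_totals, column_totals, n)
print(f[1][2])   # frequency in row 2, column 3 of Table 5
```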

Categorical variables may be ordered as in Table 5 or unordered as
in variables such as field of study (major field). Some measures, such
as $\gamma$, are useful when categories are ordered; others, like those based
on $\chi^2$ and $\lambda$, ignore any ordering of categories. For situations in which
both categories of cross-classification are ordered, measures of
association designed for continuous measures could be used by assigning
scores to each category and using a procedure which handles ties well.
Gamma 

The measure $\gamma$ is basically a version of Kendall's $\tau$ developed
for the case where the number of tied observations is large. This is
the situation in a typical contingency table. For example, in Table 5
there are only three possible rankings for 78 students, and therefore
it is impossible for each to have his own distinct rank. In fact, a
great number are put in the same category by any one rater and are,
hence, "tied." Gamma was developed by Goodman and Kruskal (1954) as a
symmetric measure of monotone association for ordered categorical
variables. Gamma is estimated by

$$G = \frac{P - Q}{P + Q} \tag{8}$$

where P and Q have the same meaning as for Kendall's tau. When there are
no ties, P + Q = n(n-1)/2 and G = $\tau$. Note that tied observations are
not considered in the calculation of G.

Figure 6a, b, and c gives examples of tables that produce perfect
monotone association (G = 1). Tables with patterns like these with
relatively low frequencies in the "zero cells" will lead to high values of G.
The value of G is zero in the table in Figure 6d in which the pattern
of nonzero cells is not monotone.



Fig. 6. Tables illustrating various values of G: (a) G = 1.0;
(b) G = 1.0; (c) G = 1.0; (d) G = 0.0. (Note that
all cells marked f in table d have the same frequency.)
The continuous SAT data of Table 4, where the number of ties is
small, yields

$$G = \frac{199 - 67}{199 + 67} = \frac{132}{266} = .496$$

which is close to $\tau_b$ = .488.

An algorithm for the calculation of G in contingency tables is
illustrated for the data of Table 5. To get P, multiply every cell
frequency $f_{ij}$ by the sum of cell frequencies for all cells which lie to
the right of and below that cell, and add these results. For Q, the
same procedure is applied to cells to the left and below.

P = 12(34 + 6 + 7 + 8) + 3(7 + 8) + 5(6 + 8) + 34(8) = 1047

Q = 5(3 + 1) + 34(1) + 2(3 + 34 + 1 + 7) + 6(1 + 7) = 192

$$G = \frac{1047 - 192}{1047 + 192} = \frac{855}{1239} = .690$$
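
The cell-by-cell algorithm just described translates directly into a short routine. The following sketch assumes the table is stored as a list of rows with both sets of categories in increasing order; the function name `concordant_discordant` is illustrative.

```python
def concordant_discordant(f):
    """Count agreeing (P) and disagreeing (Q) pairs in an ordered two-way table."""
    rows, cols = len(f), len(f[0])
    p = q = 0
    for i in range(rows):
        for j in range(cols):
            # Cells below and to the right agree in order with cell (i, j).
            below_right = sum(f[a][b] for a in range(i + 1, rows)
                              for b in range(j + 1, cols))
            # Cells below and to the left disagree in order with cell (i, j).
            below_left = sum(f[a][b] for a in range(i + 1, rows)
                             for b in range(j))
            p += f[i][j] * below_right
            q += f[i][j] * below_left
    return p, q

table5 = [[12, 5, 2], [3, 34, 6], [1, 7, 8]]
p, q = concordant_discordant(table5)
g = (p - q) / (p + q)
print(p, q, round(g, 3))   # 1047 192 0.69
```

Pairs falling in the same row or the same column never enter either sum, which is how tied observations are left out of G.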

Simple inspection of the data matrix in Table 5 indicates monotone
agreement between the two teachers. In this situation, $\tau_b$ = .484,
demonstrating how different G and $\tau_b$ may be when the number of ties is large.

Asymptotic tests and confidence intervals for the value of $\gamma$ may be
based on the asymptotic normal distribution of G. For details, see
Goodman and Kruskal (1963, 1972); these results are not included here
because of the tedious calculations involved. However, conservative
asymptotic procedures for the case where the observations constitute a
simple random sample from the population may be based on the assumption
that G has approximately a normal distribution with mean $\gamma$ and estimated
variance

$$s_G^2 = \frac{2n(1 - G^2)}{P + Q} \tag{9}$$

where G is given by (8).

Somers' d 

Closely related to $\gamma$ and $\tau$ are the asymmetric measures developed by
Somers (1962). When predicting y from x is of interest,

$$d_{yx} = \frac{P - Q}{P + Q + T_y - T_{xy}} \tag{10}$$

where P and Q are the same as for Kendall's tau, $T_y$ is the number of
pairs tied in y, and $T_{xy}$ is the number of pairs in which both x and y
are tied.

A d for the symmetric case given by Anderson and Zelditch (1968)
is exactly the same as $\tau_b$,

$$d = \frac{P - Q}{\left[ (P + Q + T_y - T_{xy})(P + Q + T_x - T_{xy}) \right]^{1/2}} \tag{11}$$

The relation is $\tau_b^2 = d_{yx} d_{xy} = d^2$. Remember that $\binom{n}{2} = P + Q + T_x + T_y - T_{xy}$.
Significance tests are based on S = P - Q as for $\tau$; see also Goodman and
Kruskal (1972).
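
The relation between the two asymmetric measures and $\tau_b$ can be verified numerically with the counts reported earlier for the SAT data of Table 4 (P = 199, Q = 67, four pairs tied on x, seven tied on y, one tied on both). The function name `somers_d` is an illustrative assumption.

```python
import math

def somers_d(p, q, t_dep, t_xy):
    """Somers' d by equation (10); t_dep counts pairs tied on the predicted variable."""
    return (p - q) / (p + q + t_dep - t_xy)

# Counts reported earlier for the SAT data of Table 4.
p, q, t_x, t_y, t_xy = 199, 67, 4, 7, 1

d_yx = somers_d(p, q, t_y, t_xy)   # predicting y from x
d_xy = somers_d(p, q, t_x, t_xy)   # predicting x from y
print(round(d_yx, 3), round(d_xy, 3), round(math.sqrt(d_yx * d_xy), 3))
```

The square root of the product, about .488, matches the value of $\tau_b$ computed for the same data, in line with the relation stated above.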

Lambda 

The $\lambda$ measures, both symmetric and asymmetric, were developed by
Goodman and Kruskal (1954) as measures of predictive association,
different in concept from the measures of monotone or linear association
discussed so far. The basic purpose of $\lambda_y$ is to measure the degree of
success with which an individual's x value may be used to predict his
y value. The prediction procedure when x is known is to pick the y
value that has the highest frequency for that value of x; no use is
made of any ordering of the actual score values. Using such a procedure
one can define the probability of making an error in prediction when x
is known, P(error | x known), and the probability of making an error in
prediction when x is unknown and the value of y with the highest
frequency overall is predicted. Then $\lambda_y$ is defined as the proportionate
reduction in the probability of making an error owing to the knowledge
of x.

$$\lambda_y = \frac{P(\text{error} \mid x \text{ unknown}) - P(\text{error} \mid x \text{ known})}{P(\text{error} \mid x \text{ unknown})} \tag{12}$$

To calculate $L_y$ from a sample, the following argument is used.
Suppose the relationship between x and y in the sample has been observed
as in Table 5. We are then asked to guess the y score of an individual
selected at random from the n individuals represented in the table.
Without knowing that individual's x score, the best procedure is to
predict the y value with the highest frequency. Define m as the index
for which $f_{\cdot j}$ is a maximum,

        $$\max_j f_{\cdot j} = f_{\cdot m}$$

and then predict the score value y corresponding to the index m. For
the data of Table 5, suppose we consider using Teacher 2's assessment
to predict Teacher 1's assessment. Without knowing Teacher 2's rating,
we would predict that Teacher 1 would classify a student as average
since $\max_j f_{\cdot j} = 46$, which occurs for j = 2. The number of
prediction errors made by using this procedure is

(13)    Number of errors when x unknown $= n - \max_j f_{\cdot j} = n - f_{\cdot m}$

Suppose, however, that we are allowed to utilize the individual's
x score in making the prediction. If x has index i, we predict the
score value of y corresponding to $\max_j f_{ij}$; that is, pick the
value of y with the highest frequency for that value of x. Again define

        $$f_{im} = \max_j f_{ij}$$

then, for the data in Table 5, $\max_j f_{1j} = 12$, $\max_j f_{2j} = 34$,
$\max_j f_{3j} = 8$. The number of errors made using this procedure is
$n - \sum_i f_{im}$, which is 78 - (12 + 34 + 8) = 24 for the data in
Table 5. Then $L_y$ is the proportionate reduction in the number of
errors when x is utilized.

(14)    $$L_y = \frac{\text{Number of errors when x is unknown} - \text{Number of errors when x is known}}{\text{Number of errors when x is unknown}}$$

        $$= \frac{n - \max_j f_{\cdot j} - [n - \sum_i \max_j f_{ij}]}{n - \max_j f_{\cdot j}} = \frac{\sum_i f_{im} - f_{\cdot m}}{n - f_{\cdot m}}$$

where $f_{\cdot m}$ is the largest frequency from the column totals and
$\sum_i f_{im}$ is the sum over all rows of the largest cell frequency
in each row.

For the data in Table 5, $L_y$ = (12 + 34 + 8 - 46)/(78 - 46) = 8/32 = .25.
Note that $L_y$ makes no use of any ordering of the actual score values
of either x or y, so it does not measure monotone association.
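
The error-counting argument behind (13) and (14) translates directly
into code. The following is a minimal sketch under stated assumptions:
the function name is hypothetical, rows index x and columns index y, and
the table is the set of Table 5 frequencies as inferred from the worked
calculations in this report (two cells are deduced from the marginal
totals).

```python
def lambda_y(table):
    """Sample L_y for predicting y (columns) from x (rows), equation (14)."""
    n = sum(map(sum, table))
    column_totals = [sum(col) for col in zip(*table)]
    f_dot_m = max(column_totals)                # largest column total
    sum_f_im = sum(max(row) for row in table)   # sum of row maxima
    errors_x_unknown = n - f_dot_m              # equation (13)
    errors_x_known = n - sum_f_im
    return (errors_x_unknown - errors_x_known) / errors_x_unknown

# Assumed (inferred) Table 5 frequencies: rows = Teacher 2 (x),
# columns = Teacher 1 (y).
table5 = [[12, 5, 2], [3, 34, 6], [1, 7, 8]]
print(lambda_y(table5))   # (32 - 24) / 32 = 0.25
```

Permuting the rows or the columns of the table leaves the result
unchanged, which illustrates that the measure ignores any ordering of
the score values.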

Tables such as those shown in Figure 7 illustrate the kinds of
association yielding large or small $L_y$. Note that in general the
values of $L_y$ may be expected to be quite different from those of
other measures of association. (For example, tables c and f in Figure 7
yield G = 1 and table d in Figure 7 yields G = 0.)




Fig. 7. Tables illustrating various values of $L_y$. (All cells marked f
have the same frequency.)

For predicting x from y, $L_x$ is defined in an analogous manner.

When a symmetric measure of association is desired (the direction of the
prediction is not important), the composite measure L can be defined

(15)    $$L = \frac{\sum_j \max_i f_{ij} + \sum_i \max_j f_{ij} - \max_i f_{i\cdot} - \max_j f_{\cdot j}}{2n - \max_i f_{i\cdot} - \max_j f_{\cdot j}}$$

Calculating L from the data of Table 5 yields these results:

        $$\sum_j \max_i f_{ij} = 12 + 34 + 8 = 54$$

        $$\sum_i \max_j f_{ij} = 12 + 34 + 8 = 54$$

        $$\max_i f_{i\cdot} = 43$$

        $$\max_j f_{\cdot j} = 46$$

        $$L = \frac{54 + 54 - 43 - 46}{2(78) - 43 - 46} = \frac{19}{67} = .284$$

Note the large difference between G = .7 and L = .3. Because L is 
not a monotonicity measure like G but rather an index of predictive 
association, there is in general no reason to expect similar results. 
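
A short sketch of the symmetric computation in (15) follows. The
function name is hypothetical, and the table is the set of Table 5
frequencies as inferred from the worked calculations in this report (an
assumption, since two cells are deduced from the marginal totals).

```python
def lambda_symmetric(table):
    """Symmetric L of equation (15) for a table with rows x and columns y."""
    n = sum(map(sum, table))
    row_totals = [sum(row) for row in table]
    column_totals = [sum(col) for col in zip(*table)]
    sum_row_maxima = sum(max(row) for row in table)           # sum_i max_j f_ij
    sum_column_maxima = sum(max(col) for col in zip(*table))  # sum_j max_i f_ij
    numerator = (sum_row_maxima + sum_column_maxima
                 - max(row_totals) - max(column_totals))
    denominator = 2 * n - max(row_totals) - max(column_totals)
    return numerator / denominator

# Assumed (inferred) Table 5 frequencies.
table5 = [[12, 5, 2], [3, 34, 6], [1, 7, 8]]
print(round(lambda_symmetric(table5), 3))   # 19/67, about .284
```

Transposing the table (interchanging the roles of x and y) gives the
same value, as expected of a symmetric measure.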

To make tests or confidence intervals on the value of $\lambda_y$ when
the observations constitute a simple random sample, we may use the fact
that $L_y$ is asymptotically normal with mean $\lambda_y$ and variance
estimated by

(16)    $$\frac{(n - \sum_i f_{im})(\sum_i f_{im} + f_{\cdot m} - 2 \sum^r f_{im})}{(n - f_{\cdot m})^3}$$

where $\sum^r f_{im}$ denotes summation of the maximum frequency in a
row only over those rows in which $f_{im}$ falls in the same column as
$f_{\cdot m}$. So, for the data in Table 5, $f_{1m} = 12$, $f_{2m} = 34$,
$f_{3m} = 8$, and the column in which $f_{\cdot m}$ occurs is column 2
($f_{\cdot m} = 46$); thus, $\sum^r f_{im} = 34$.

These properties of asymptotic normality hold under the assumptions
that a random sample of size n has been drawn from the population, that
$\lambda_y$ is not equal to zero or one, and that in the population the
maximum proportions $p_{im}$ and $p_{\cdot m}$ are unique and
$p_{\cdot m} \neq 1.0$. Significance tests and confidence intervals for
$\lambda_x$ and $\lambda$ are derived in a similar fashion; see
Goodman and Kruskal (1963, 1972) for complete details.
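
The variance estimate in (16) might be computed as in the sketch below.
The function name is hypothetical, the cubed denominator follows the
Goodman and Kruskal form of the estimate, and the table is the set of
Table 5 frequencies as inferred from the worked calculations in this
report. Ties among the maxima are resolved by taking the first maximum,
which is an assumption of this sketch; the asymptotic theory itself
presumes unique maxima.

```python
import math

def lambda_y_variance(table):
    """Estimated asymptotic variance of L_y, following equation (16)."""
    n = sum(map(sum, table))
    column_totals = [sum(col) for col in zip(*table)]
    f_dot_m = max(column_totals)
    m = column_totals.index(f_dot_m)        # column holding f_.m (first if tied)
    sum_f_im = sum(max(row) for row in table)
    # Row maxima that fall in column m (the restricted sum in (16)).
    sum_r_f_im = sum(row[m] for row in table if row[m] == max(row))
    return ((n - sum_f_im) * (sum_f_im + f_dot_m - 2 * sum_r_f_im)
            / (n - f_dot_m) ** 3)

# Assumed (inferred) Table 5 frequencies: rows = x, columns = y.
table5 = [[12, 5, 2], [3, 34, 6], [1, 7, 8]]
variance = lambda_y_variance(table5)        # 24 * 32 / 32**3
standard_error = math.sqrt(variance)
# Approximate 95% limits around L_y = .25 (z = 1.96 is an assumed choice).
limits = (0.25 - 1.96 * standard_error, 0.25 + 1.96 * standard_error)
print(variance, limits)
```

With a sample of only 78 the resulting limits are wide, so the normal
approximation should be read with caution for tables of this size.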

Uncertainty Coefficient 

The uncertainty coefficient u, sometimes called the coefficient of
constraint, was developed by information theorists as an asymmetric or
symmetric measure of association based on the reduction of uncertainty
about one variable when the other variable is known. Thus it is
conceptually similar to $\lambda$.

An explanation for the asymmetric case is developed below. Suppose
we want to predict the value of y. The "uncertainty" about an
individual's y value when x is unknown depends on the marginal
distribution of y and is defined as

(17)    $$U(y) = -\sum_j \frac{f_{\cdot j}}{n} \log\left(\frac{f_{\cdot j}}{n}\right)$$

The base of the logarithm is arbitrary, but base two is frequently used
following the lead of Claude Shannon, the information-theory pioneer, who
originally defined (17) as a measure of entropy.

When x is known

(18)    $$U(y \mid x) = -\sum_i \sum_j \frac{f_{ij}}{n} \log\left(\frac{f_{ij}}{f_{i\cdot}}\right)$$

The uncertainty coefficient, denoted here by $u_y$, when predicting y
from x is the reduction in uncertainty due to knowledge of x:

(19)    $$u_y = \frac{U(y) - U(y \mid x)}{U(y)}$$

Note the similarity to the definition of $\lambda_y$.



For the data in Table 5, using logarithms to the base 2,

        $$U(y) = -\left[\frac{16}{78}\log\frac{16}{78} + \frac{46}{78}\log\frac{46}{78} + \frac{16}{78}\log\frac{16}{78}\right] = 1.3865$$

        $$U(y \mid x) = -\left[\frac{12}{78}\log\frac{12}{19} + \frac{3}{78}\log\frac{3}{43} + \frac{1}{78}\log\frac{1}{16} + \frac{5}{78}\log\frac{5}{19} + \frac{34}{78}\log\frac{34}{43} + \frac{7}{78}\log\frac{7}{16} + \frac{2}{78}\log\frac{2}{19} + \frac{6}{78}\log\frac{6}{43} + \frac{8}{78}\log\frac{8}{16}\right] = 1.0822$$

and

        $$u_y = .219$$

This value is reasonably close to $L_y$ = .25 for the same data.
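
The calculation of the asymmetric coefficient from (17), (18), and (19)
can be sketched as follows. The function name is hypothetical,
logarithms are taken to base 2, empty cells are skipped (treating
0 log 0 as zero, a convention assumed here), and the table is the set of
Table 5 frequencies as inferred from the worked calculations in this
report.

```python
from math import log2

def uncertainty_coefficient_y(table):
    """Return (U_y, U_y_given_x, u_y) following equations (17)-(19).

    Rows index x and columns index y; logarithms are to base 2.
    """
    n = sum(map(sum, table))
    row_totals = [sum(row) for row in table]
    column_totals = [sum(col) for col in zip(*table)]
    # Equation (17): uncertainty of y from its marginal distribution.
    U_y = -sum((f / n) * log2(f / n) for f in column_totals if f > 0)
    # Equation (18): uncertainty of y remaining once x is known.
    U_y_given_x = -sum(
        (f / n) * log2(f / row_totals[i])
        for i, row in enumerate(table)
        for f in row
        if f > 0
    )
    # Equation (19): reduction in uncertainty relative to U(y).
    return U_y, U_y_given_x, (U_y - U_y_given_x) / U_y

# Assumed (inferred) Table 5 frequencies: rows = x, columns = y.
table5 = [[12, 5, 2], [3, 34, 6], [1, 7, 8]]
U_y, U_y_given_x, u_y = uncertainty_coefficient_y(table5)
print(round(U_y, 4), round(U_y_given_x, 4), round(u_y, 3))
```

Floating-point evaluation gives intermediate values very close to,
though not necessarily identical with, the hand-computed figures
printed above; the coefficient itself rounds to .219.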



A symmetric version of the uncertainty coefficient is

(20)    $$u = \frac{U(y) + U(x) - U(y, x)}{U(y) + U(x)}$$

where

(21)    $$U(y, x) = -\sum_i \sum_j \frac{f_{ij}}{n} \log\left(\frac{f_{ij}}{n}\right)$$

For more information and significance tests, see Attneave (1959). 
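
The symmetric coefficient of (20) and (21) can be sketched in the same
way. The function names are hypothetical, base-2 logarithms and the
convention that empty cells contribute nothing are assumptions of the
sketch, and the table is the set of Table 5 frequencies as inferred from
the worked calculations in this report.

```python
from math import log2

def entropy(frequencies, n):
    """Uncertainty -sum (f/n) log2(f/n) over the nonzero frequencies."""
    return -sum((f / n) * log2(f / n) for f in frequencies if f > 0)

def uncertainty_coefficient_symmetric(table):
    """Symmetric u of equation (20), with U(y, x) from equation (21)."""
    n = sum(map(sum, table))
    U_x = entropy([sum(row) for row in table], n)
    U_y = entropy([sum(col) for col in zip(*table)], n)
    U_yx = entropy([f for row in table for f in row], n)   # equation (21)
    return (U_y + U_x - U_yx) / (U_y + U_x)

# Assumed (inferred) Table 5 frequencies: rows = x, columns = y.
table5 = [[12, 5, 2], [3, 34, 6], [1, 7, 8]]
print(round(uncertainty_coefficient_symmetric(table5), 3))
```

As written in (20), the coefficient divides the shared uncertainty by
the sum of the two marginal uncertainties, so it cannot exceed one-half;
some later treatments multiply the ratio by two to rescale it to the
unit interval, which is not done here.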

Chi Square

The chi-square test is commonly used to test for association between
categorical variables. The chi-square test ignores ordering of the
variables; that is, it is insensitive to the score values of x and y.
The $\chi^2$ statistic itself cannot be used as a measure of association
since it